Searching and Summarizing in a Multilingual Environment

Authors

  • Michal Toman
  • Josef Steinberger
  • Karel Jezek
Abstract

Multilingual aspects have been gaining increasing attention in recent years. This trend has been accentuated by the global integration of European states and the vanishing of cultural and social boundaries. The ever-increasing use of foreign languages is due to the information boom caused by the emergence of easy internet access. Multilingual text processing has become an important field that brings many new and interesting problems, and this paper proposes possible solutions to them. The first part is devoted to methods for multilingual searching; the second part deals with the summarization of retrieved texts. We tested several novel processing techniques: a language-independent storage format, semantic-based indexing, query expansion, and text summarization, which lead to faster and easier retrieval and understanding of documents. We implemented a prototype system named MUSE (Multilingual Searching and Extraction) and compared its quality with a state-of-the-art search engine, Google. The results seem promising; MUSE shows high correlation with the market-leading products. Although we used Czech and English articles in our experiments, the main principle applies to other languages as well.

Similar resources

MUMIS – A Multimedia Indexing and Searching Environment

We describe in this paper the MUMIS Project (Multimedia Indexing and Searching Environment) and show the role linguistically motivated annotations, coupled with domain-specific information, can play for the indexing and the searching of multimedia (and multilingual) data. MUMIS develops and integrates base technologies, demonstrated within a laboratory prototype, to support automated multimedi...

An extensible approach to high-quality multilingual typesetting

We propose to create and study a new model for the micro-typography part of automated multilingual typesetting. This new model will support quality typesetting for a number of modern and ancient scripts. The major innovations in the proposal are: the process is refined into four phases, each dependent on a multidimensional tree-structured context summarizing the current linguistic and cultural ...

Transculturation and Multilingual Lives: Writing between Languages and Cultures

This paper looks at the issues of transculturation as explored in auto and semi-autobiographical accounts of linguistic and cultural transitions. The paper also addresses a number of questions about the structure of these texts, the authors’ linguistic competences, as well as questions about the theoretical and conceptual tool which may help us to discuss the issues the writers are reflecting o...

Summarizing Multilingual Spoken Negotiation Dialogues

We present the multilingual summarization functionality for Verbmobil, a speech translation system. We reuse resources of the system to create a summary. After content extraction, we interpret the results in the dialog context. A summary generator provides the input to generation. A first evaluation indicates the feasibility of the approach.

The Automatic Generation of Formal Annotations in a Multimedia Indexing and Searching Environment

We describe in this paper the MUMIS Project (Multimedia Indexing and Searching Environment), which is concerned with the development and integration of base technologies, demonstrated within a laboratory prototype, to support automated multimedia indexing and to facilitate search and retrieval from multimedia databases. We stress the role linguistically motivated annotations, coupled with dom...

Publication date: 2006